Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation
Abstract
This paper presents a geometrically constrained transformation approach for fast acoustic adaptation that improves the modeling resolution of conventional Maximum Likelihood Linear Regression (MLLR). The approach exposes and quantifies the underlying geometric difference between the seed and target spaces and uses it as prior knowledge to reconstruct refiner transforms. By ignoring dimensions that contribute little to this difference, the transform can be constrained to a lower-rank subspace, and only distortions within this subspace are refined in a cascaded process. Compared with the previous cascade method, we employ a different parameterization and obtain a higher resolution. At the same time, because the geometric span of the refiner transforms is tightly controlled, they can be adapted quickly, so the approach achieves a better tradeoff between resolution and robustness. In Mandarin dialect adaptation, it provides a 4-9% relative word error rate reduction over MLLR and 3-5% over the previous cascade method, respectively, with varying amounts of adaptation data.
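To make the cascade idea concrete, here is a minimal sketch in plain Python. It is not the paper's parameterization, which the abstract does not specify. It only assumes a first-stage affine mean transform of the usual MLLR form (mean -> A·mean + b), followed by a refiner that may change the mean only inside a rank-k subspace. The names `global_transform`, `subspace_refine`, `U`, `C`, and `d` are hypothetical, and the numbers are invented for illustration.

```python
# Illustrative sketch only: a two-stage (cascaded) mean adaptation in which
# the second stage is restricted to a low-rank subspace. All names and
# values here are assumptions, not taken from the paper.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def global_transform(A, b, mean):
    """First stage: ordinary affine mean transform, mean -> A*mean + b."""
    return add(matvec(A, mean), b)

def subspace_refine(U, C, d, mean):
    """Second stage: adjust `mean` only inside the span of U's columns.

    U is dim x k with orthonormal columns, C is k x k, d has length k.
    Components of `mean` orthogonal to span(U) are left untouched.
    """
    coords = matvec(transpose(U), mean)  # k coordinates inside the subspace
    delta = add(matvec(C, coords), d)    # k-dimensional correction
    return add(mean, matvec(U, delta))   # lift the correction back to dim

# Toy example: 3-dimensional mean, rank-1 refiner along the first axis.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [0.5, -0.5, 0.0]
U = [[1.0], [0.0], [0.0]]
C = [[0.5]]
d = [1.0]

seed_mean = [1.0, 2.0, 3.0]
stage1 = global_transform(A, b, seed_mean)
adapted = subspace_refine(U, C, d, stage1)
print(stage1, adapted)
```

In this sketch the refiner has k·k + k free parameters instead of dim·dim + dim for a full affine transform, which is one way to read the abstract's claim that a tightly controlled span allows quick adaptation from little data.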
Similar resources
Discounted likelihood linear regression for rapid speaker adaptation
The widely used maximum likelihood linear regression speaker adaptation procedure suffers from overtraining when used for rapid adaptation tasks in which the amount of adaptation data is severely limited. This is a well known difficulty associated with the expectation maximization algorithm. We use an information geometric analysis of the expectation maximization algorithm as an alternating min...
An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...
Maximum Likelihood Linear Regression (MLLR) for ASR Severity Based Adaptation to Help Dysarthric Speakers
Automatic speech recognition (ASR) for dysarthric speakers is one of the most challenging research areas, and the lack of corpora for dysarthric speakers makes it even more difficult. Speaker adaptation (SA) is an alternative solution that overcomes the lack of dysarthric speech and enhances ASR performance. This paper introduces the Severity-based adaptation, using small amount of speech dat...
The Speaker Adaptation of an Acoustic Model
This paper deals with several adaptation techniques that are important in cases where the identity of a speaker is known and we want to recognize that speaker's speech. We use three different methods, namely Maximum A Posteriori Probability adaptation, Maximum Likelihood Linear Regression and Constrained Maximum Likelihood Linear Regression. Each of the methods yields various benefits, therefore...
Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation
We propose a discriminative fuzzy clustering maximum a posterior linear regression (DFCMAPLR) model adaptation approach to compensate the acoustic mismatch due to speaker variability. The DFCMAPLR approach adopts the MAP criterion and a discriminative objective function to estimate shared affine transform and fuzzy weight sets, respectively. Then, through a linear combination of the calculated ...
Publication date: 2003